The combinatorics of tandem duplication
Abstract
Tandem duplication is an evolutionary process whereby a segment of DNA is replicated and proximally inserted. The different configurations that can arise from this process give rise to some interesting combinatorial questions. Firstly, we introduce an algebraic formalism to represent this process as a word-producing automaton. The number of words arising from n tandem duplications can then be recursively derived. Secondly, each single word accounts for multiple evolutions. With the aid of a bi-coloured 2d-tree, a Hasse diagram corresponding to a partially ordered set is constructed, from which we can count the number of evolutions corresponding to a given word. Thirdly, we implement some subtree prune and graft operations on this structure to show that the total number of possible evolutions arising from n tandem duplications is $\prod_{k=1}^{n} (4^k - (2k + 1))$. The space of structures arising from tandem duplication thus grows at a super-exponential rate with leading order term $O(4^{\frac{1}{2}n^2})$.
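The closed form quoted in the abstract can be checked numerically for small n. The sketch below reads the product as $\prod_{k=1}^{n} (4^k - (2k + 1))$; the exponent k on the 4 is an assumption, chosen because it agrees with the stated leading-order growth. The function name `count_evolutions` is hypothetical and is not taken from the paper.

```python
from math import log


def count_evolutions(n: int) -> int:
    """Return the product over k = 1..n of (4**k - (2*k + 1)).

    Assumption: this is the intended reading of the formula in the
    abstract; the exponent on 4 is reconstructed, not quoted.
    """
    total = 1
    for k in range(1, n + 1):
        total *= 4**k - (2 * k + 1)
    return total


if __name__ == "__main__":
    for n in range(1, 9):
        value = count_evolutions(n)
        # Compare log base 4 of the count with the leading-order exponent n^2 / 2.
        print(n, value, round(log(value, 4), 3), n * n / 2)
```

Since the product of the $4^k$ factors alone is $4^{n(n+1)/2}$, the base-4 logarithm of the count is dominated by $n^2/2$, which matches the leading-order term in the abstract.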
Similar resources
The combinatorics of tandem duplication trees.
We developed a recurrence relation that counts the number of tandem duplication trees (either rooted or unrooted) that are consistent with a set of n tandemly repeated sequences generated under the standard unequal recombination (or crossover) model of tandem duplications. The number of rooted duplication trees is exactly twice the number of unrooted trees, which means that on average only two ...
The Combinatorics of Tandem Duplication Trees
We develop a recurrence relation that counts the number of Tandem Duplication Trees (either rooted or unrooted) that are consistent with a set of n tandemly repeated sequences generated under the standard unequal recombination (or crossover) model of tandem duplications. We find that the number of rooted duplication trees is exactly twice the number of unrooted trees, which means, on average, o...
Gene Family: Structure, Organization and Evolution
Gene families are groups of homologous genes that share very similar sequences and may have identical functions. Members of gene families may be found in tandem repeats or interspersed through the genome. These sequences are copies of ancestral genes that have undergone changes. The multiple copies of each gene in a family were constructed based on gene duplicati...
Generalized Neighbor Joining Approaches for Reconstructing Tandem Duplication History: A Comparative Study
Motivation: Genomes are replete with short sequences repeated consecutively called tandem repeats. Reconstructing duplication histories for tandem repeats may yield valuable insights into their functions and the biological mechanisms of tandem repeat creation and extension. Results: We study the generalized neighbor-joining approaches for reconstructing tandem duplication history. We develop a ...
Neighbor Joining Approaches for Reconstructing Tandem Duplication History
Motivation: Genomes are replete with short sequences repeated consecutively called tandem repeats. Reconstructing duplication histories for tandem repeats may yield valuable insights into their functions and the biological mechanisms of tandem repeat creation and extension. Results: we design and implement a set of heuristic algorithms for reconstructing tandem duplication history with neighbor...
Journal:
- Discrete Applied Mathematics
Volume 194
Publication date: 2015